Academic medicine’s glass ceiling: Author’s gender in top three medical research journals impacts probability of future publication success

Introduction. In December 2017, Lancet called for gender inequality investigations. Holding other factors constant, trends over time for significant author (i.e., first, second, last or any of these authors) publications were examined for the three highest-impact medical research journals (i.e., New England Journal of Medicine [NEJM], Journal of the American Medical Association [JAMA], and Lancet).
Materials and methods. Using randomly sampled 2002-2019 MEDLINE original publications (n = 1,080; 20/year/journal), significant author-based and publication-based characteristics were extracted. Gender assignment used internet-based biographies, pronouns, first names, and photographs. Adjusting for author-specific characteristics and multiple publications per author, generalized estimating equations tested for first, second, and last significant author gender disparities.
Results. Compared to the 37.23% of 2002-2019 U.S. medical school full-time faculty who were women, women's first author publication rates (26.82% overall, 15.83% NEJM, 29.38% Lancet, and 35.39% JAMA; all p < 0.0001) were lower. No improvements over time occurred in women first authorship rates. Women first authors had lower Web of Science citation counts and co-authors/collaborating author counts, less frequently held M.D. or multiple doctoral-level degrees, less commonly published clinical trials or cardiovascular-related projects, but more commonly were North American-based and studied North American-based patients (all p < 0.05). Women second and last authors were similarly underrepresented. Compared to men, women first authors had lower multiple publication rates in these top journals (p < 0.001). Same gender first/last authors resulted in higher multiple publication rates within these top three journals (p < 0.001).
Discussion. Since 2002, this authorship "gender disparity chasm" has been tolerated across all these top medical research journals. Furthermore, despite Lancet's 2017 call to arms, the author-based gender disparities have not changed for these top medical research journals, even in recent times. Co-author gender alignment may reduce future gender inequities, but this promising strategy requires further investigation.


Introduction
In North America and much of the industrialized world, top medical research journals' findings wield significant influence over the clinical practice of medicine. Professional medical societies often utilize these top medical journals' findings to support their clinical practice guideline recommendations. Within disciplines, the top medical research journals are ranked based upon journal impact factor. In academic medicine, moreover, publications weighted by impact factor are combined with grant funding and metrics such as h-index to evaluate scholarly performance [1].
Gender disparities persist within academic medicine [2][3][4][5][6][7]. Although the number of women physicians has steadily increased over the last few decades, a recent study found that women were less likely than men to hold the rank of Associate Professor or Full Professor, and were less likely to serve as Department Chairs [7]. For promotion, the time between rank levels was longer for women than men [7][8][9].
Gender disparities in research productivity have been documented [10][11][12][13][14][15][16][17][18] and reported across different medical specialties; orthopedic surgery (5.3%) and interventional cardiology (8.4%) have the lowest proportion of women authors based on published contributions in these respective fields. Multiple studies in specialty-specific fields (e.g., Alzheimer's research) have also identified that women are less likely to publish in high impact factor journals and less likely to have their work cited [15][16][17][18]. Evidence reported in the field of economics has documented that female-authored papers take longer in peer review and may be held to a higher standard than male-authored papers [19][20]. Another study showed that, across 2,898 papers published between 1995 and 2017 in a range of biomedical journals with more than one author sharing the first author position, male authors working with a female co-first author were more likely to be named first, suggesting factors other than alphabetical ordering were at play in some cases [21]. As a faculty member's publication rate in top medical journals, particularly in first, second, or last author positions, may influence their institution's Academic Promotion and Tenure Committee's decisions, these publications may impact researchers' salary, job prospects, and competitiveness for grant funding. In academic medicine, top medical research journals' publications represent a career advancement goal, a "holy grail" to which many faculty members aspire.
In December 2017, Lancet called for investigations of gender-based inequalities and affirmed their commitment to gender equity in their publication practices [22]. To address this knowledge gap, the three highest-impact medical research journals (i.e., The New England Journal of Medicine [NEJM], The Journal of the American Medical Association [JAMA], and The Lancet) were evaluated for their first, second, and last author-based and publication-based characteristics. These endpoints were chosen because author line position on biomedical research manuscripts is typically not random or alphabetical in US and many European-based journals. That is, authors are typically listed in decreasing order of contribution, wherein the first or "lead" author is usually the individual who contributed the most to the paper. The exception to this is the last or "senior" author position, which is usually filled by a more experienced researcher who provides mentorship and some degree of leadership for the project.
This study's goal was to identify if gender disparities exist across the three top medical research journal publications' significant authorship roles while holding constant all other author-specific and publication-specific factors. Significant author roles were identified based on first, second, or last co-author positions held for a publication listed within the MEDLINE database. To evaluate gender disparity trends over time, the study years were grouped into "early" (2002-2008), "mid-" (2009-2014), and "late" (2015-2019) time periods. With the primary endpoint focused upon first author gender disparities, the study's two primary null hypotheses were:
• First Author Single Top Medical Research Journal Publications: Across the three top medical research journals (i.e., NEJM, JAMA, and Lancet) and overall, no differences exist in the publication rates for women versus men first authors; additionally, no trends over time periods would be found for first author gender disparities.
• First Author Multiple Top Medical Research Journal Publications: Across the three top medical research journals (i.e., NEJM, JAMA, and Lancet) and overall, no differences exist in the single versus multiple publication rates for women versus men first authors.
As a sub-analysis, an assessment of the impact of gender concordance among significant authors (i.e., between first and last authors or all three significant co-author team members) was planned in advance.

Study population
As a bibliometric database analysis of the three top medical research journals, this retrospective, cohort study evaluated trends over time in gender disparities for original research articles, across all journals as well as within each journal. Although MEDLINE records from January 1, 2002 to December 31, 2019 for all three top medical research journals were pulled, only publications classified as original research articles were retained. To focus on reports of original scientific investigations, publications without a structured abstract were also removed. All MEDLINE data elements for each original research article were extracted including journal title, publication date, the publications' Medical Subject Headings (MeSH), clinical trial design, grant funding support, and coauthor counts. Collaborating author counts were gathered for publications published as of 2008 or later, as that was the first year that MEDLINE began consistently reporting collaborating authors as a separate data field [23]. For each original article, moreover, the details for all significant (i.e., first, second, and last) authors were identified, clearly documenting each's co-author order in these MEDLINE records.

Study endpoints
For specific author roles (i.e., first, second, or last), a "gender-disparity" was defined when a statistically significant difference was found for the proportion of women versus men authors. Although the primary hypothesis focused on women in first author roles, the secondary study hypotheses focused upon women in second, last, or any significant (i.e., first, second or last) author roles.
As a primary endpoint, the proportion of women first authors that had a subsequent first author publication or subsequent publication in any other significant author role was compared to the proportion of men achieving this same outcome. Holding other author and publication characteristics constant, the impact of the first author's gender and their team's gender alignment (i.e., either first and last authors with the same gender or all team members having the same gender) was assessed. To identify future opportunities to reduce any top medical research journal publications' gender-related disparities, exploratory analyses were conducted.

Author characteristics
To assess the primary study variable of interest, gender, information was extracted from biographies, curricula vitae, pronouns, first names, and gender-assessment of authors' photographs as posted on institutional or other affiliated websites. Whenever possible, an author's gender was determined based on their own self-identification (i.e., based on the pronouns used on web sites, news articles, press releases, professional society announcements, resumes, etc.); when this was not possible, however, gender was assigned based on the study team's consensus, following a discussion weighing all available information to assess that author's gender. Though multiple-gender individuals may have been represented in this sample, no specific indications of multiple-gender authors were found during this study's data collection process. Pragmatically, gender was assigned as a binary (woman/man) characteristic. In cases where a consensus was not reached, the gender data field was designated as "unknown." Although evaluating trends over time in author-based gender disparities was this study's primary focus, additional author-based and publication-based characteristics were extracted. Author-based characteristics included academic degrees, titles held, academic rank, leadership roles, specialty area of expertise, and institutional affiliation at the time of publication. Additionally, the concordance between each author's specialty training and their publication's MeSH classifications was assessed; these comparisons were performed for the three most common MeSH classifications (cardiovascular diseases, infectious disease, and neoplasms). Publication-based characteristics included study population, design details, endpoints, and directionality of study findings.
Following initial determinations of all author-related and publication-related characteristics, an independent audit was performed by two individuals of 90 records randomly selected across time with 30 records per top medical research journal. For audited records, the inter-rater reliability of web-based data extracted was calculated using kappa statistics for dichotomous variables (i.e., gender). For all author-related characteristics, including gender, extremely high inter-rater reliability was documented. Detailed reports for this study's data capture audit findings are provided in S1 Appendix.
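As a hedged illustration of the kappa statistic named above, the following Python sketch computes Cohen's kappa for two raters on a dichotomous variable such as gender. The function name cohen_kappa and the 2 x 2 counts are hypothetical and are not the audit data, which are reported in S1 Appendix.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of records that rater A placed in category i
    and rater B placed in category j.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of records on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the two raters' marginal shares,
    # summed over categories.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts for illustration only (not the study's audit records).
example_table = [[20, 5],
                 [10, 15]]
kappa = cohen_kappa(example_table)
print(round(kappa, 4))
```

Kappa rescales raw agreement by the agreement expected from the raters' marginal rates alone, so two raters who agree often only because one category dominates do not receive a misleadingly high score.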

Original sample size calculation
To calculate the study's sample size required for more detailed author-specific data capture, the primary hypotheses of gender disparities across three top medical journals were planned to be tested using chi-square tests with pre-established type I error = 0.05 and power = 90%. Based on pilot study data records captured, effect sizes for the chi-squared test of association in two-way contingency tables (corresponding to the alternative hypothesis) were initially estimated at 0.16 (first authors), 0.076 (second authors), 0.03 (last authors), and 0.08 (any significant author roles). To detect first author gender disparities, it was estimated that at least 501 unique first authors would be required using the function pwr.chisq.test() in R package "pwr" (R Foundation for Statistical Computing, Vienna, Austria). Based on the preliminary estimated effect sizes for second, last, and any significant authors, the corresponding pre-adjustment estimates of unique authors required were 2,204, 13,189, and 2,016, respectively. To evaluate trends over the three study time periods, an additional 50% inflation factor was applied, raising the initial estimate from 501 to 751 unique first authors. Based on a recent publication, 25% of first authors were estimated to have multiple publications [24]. Assuming up to a 10% unknown first author gender rate, a total of 1,033 first author publications were required. Correspondingly, the inflated sample sizes estimated were 4,545 second author publications, 27,202 last author publications, and 4,158 any significant author publications to detect gender differences.
Given the study's primary focus placed on first author gender disparities, 20 articles per journal per year were randomly selected by journal for each year from 2002 to 2019, yielding 1,080 publications (i.e., 20 publications per year per journal = 360 per journal). Therefore, the ability to detect gender differences for second, last, and any significant author was deemed to be highly unlikely. Hence, this study's main focus was placed upon detecting first author gender disparities.
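The arithmetic behind these adjusted sample sizes can be sketched in Python. The reported totals are reproduced when the three stated adjustments (50% time-trend inflation, 25% multiple publications, 10% unknown gender) are multiplied together and the result is truncated; that combination rule is an assumption inferred from the reported numbers, not a formula stated in the text, and the names below are hypothetical.

```python
import math
from fractions import Fraction

# Unadjusted unique-author estimates reported in the text.
base_estimates = {"first": 501, "second": 2204, "last": 13189, "any": 2016}

# Assumed combination of the three stated adjustments (an interpretation,
# not a formula given in the text): 1.5 x 1.25 x 1.10 = 33/16.
time_trend = Fraction(3, 2)
adjustment = time_trend * Fraction(5, 4) * Fraction(11, 10)


def inflate(n, factor):
    """Apply an inflation factor and truncate to a whole count."""
    return math.floor(n * factor)


first_after_time_trend = inflate(base_estimates["first"], time_trend)
inflated = {role: inflate(n, adjustment) for role, n in base_estimates.items()}
print(first_after_time_trend)
print(inflated)

# Publications actually sampled: 20 per journal per year, 3 journals, 2002-2019.
n_years = len(range(2002, 2020))
sampled = 20 * 3 * n_years
print(sampled)
```

Exact fractions are used instead of floating-point factors so that the truncation step is not affected by rounding error.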

Comparison to U.S. medical school full time faculty
The Association of American Medical Colleges' (AAMC) FAMOUS database [25] (Table 9) was used (accessed by the Dean's Office staff at Stony Brook Medicine on May 24, 2021) to evaluate the proportion of women serving as faculty members by rank and in total for the entire study period from 2002 to 2019 for all US-based medical schools. The year-by-year and overall rates of women for US-based academic medical schools were calculated and compared to author position-based (i.e., first authors) publication rates.

Statistical analyses
All authors or faculty with unknown gender were removed from all analyses performed; however, missing rates were reported for an independent assessment of data completeness. Generalized estimating equation (GEE) models clustered first, second, and last authors to compare the rates of publications by women versus men within or across each journal. Similar models were used to compare publication-level characteristics (i.e., time period, co-author count, collaborating author count since 2008, clinical trial, grant funding, standardized Web of Science [WOS] citation count, directionality, and study population's geographic location) between publications with women vs. men for first, second, and last author positions. Chi-square tests (with exact p-values from Monte-Carlo simulation if small cell counts existed) were used to compare author-level characteristics (i.e., specialty, degree, leadership position, academic rank, author's institutional geographic location, and concordance of author's specialty designation with the article's MeSH classifications) between women vs. men authors in first, second, last, or any significant author role. Time-based comparisons were performed across years (i.e., time trend analyses) and across the three study time periods (i.e., early, mid-, and late). With authors and publications used as clustering effects, GEE models also examined trends over time (using year-by-year comparisons) for women publication rates in first, second, last, or any significant author roles over years. As appropriate, Fisher's exact tests were used to compare multiple publication rates between women vs. men in first and last author roles within/ across journals, as well as within each journal. For clarity, the statistical tests used are noted beneath each table.
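To illustrate the idea of a Monte Carlo p-value for a chi-square test with small cell counts, the following Python sketch shuffles one classification while holding both margins fixed and counts how often the simulated statistic is at least as large as the observed one. It is a simplified stand-in for the SAS procedure used in the study, not a reproduction of it; the function names and the 2 x 2 table are hypothetical.

```python
import random


def chi_square_stat(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def monte_carlo_p_value(table, n_sim=2000, seed=0):
    """Permutation p-value: shuffle column labels with both margins fixed."""
    rng = random.Random(seed)
    n_rows, n_cols = len(table), len(table[0])
    # Expand the table into one (row label, column label) pair per record.
    row_labels, col_labels = [], []
    for i in range(n_rows):
        for j in range(n_cols):
            row_labels += [i] * table[i][j]
            col_labels += [j] * table[i][j]
    observed = chi_square_stat(table)
    extreme = 0
    for _ in range(n_sim):
        rng.shuffle(col_labels)
        simulated = [[0] * n_cols for _ in range(n_rows)]
        for i, j in zip(row_labels, col_labels):
            simulated[i][j] += 1
        # Small tolerance so floating-point ties count as "at least as extreme".
        if chi_square_stat(simulated) >= observed - 1e-9:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_sim + 1)


# Hypothetical small-count table for illustration only.
hypothetical_table = [[8, 2],
                      [1, 9]]
observed_stat = chi_square_stat(hypothetical_table)
p_value = monte_carlo_p_value(hypothetical_table)
print(round(observed_stat, 3), p_value)
```

The simulated p-value avoids relying on the large-sample chi-square approximation, which can be unreliable when expected cell counts are small.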
Chi-square tests were used to compare multiple publication rates among first authors who had a same-gender last author, as well as for gender team alignment. Multivariable logistic regressions were performed to identify the factors predictive of female first authors and first/last author gender alignment; for these, model eligible variables included other publication and author characteristics with a bivariate screening association p-value ≤ 0.10.
For all comparisons, the missing data details for each author-based and publication-based characteristic were reported; however, missing values were excluded from all p-value calculations. A p-value ≤ 0.05 was used as the statistical significance threshold to identify differences. Above this threshold, slightly higher p-values (i.e., up to p ≤ 0.15) identified trends to support future research [26]. In all cases, however, actual p-values are reported to facilitate independent interpretations. All statistical analysis was performed using SAS 9.4 (SAS Institute Inc., Cary, NC).

Sample generalizability
Across study time periods, the 10,436 records were compared to the 1,080 records sampled; there was no significant sampling bias found, based on the random abstraction of 20 articles per year per journal. Detailed generalizability reports comparing the publication-based characteristics for the MEDLINE data fields extracted between the sampled (n = 1,080) versus nonsampled (n = 9,356) top medical research journal records' characteristics are provided in S2 Appendix; these reports provide assurance that this study's sampling process was robust.

Gender disparities
Gender disparities were found for first, second, and last author roles, both overall and within each of the three top medical research journals. The proportion of women in first author roles was 26.82% overall, with significant bivariate variations in women vs. men author rates among the three journals. When women first authors were analyzed separately by journal, the proportion of women ranged from 15.83% in NEJM, to 29.38% in Lancet, and 35.39% in JAMA. Across all three journals, women first author rates were lower than for men (p < 0.0001). Overall, the rate of women serving in any significant author role (i.e., first, second, or last author roles) was lower than for men (p < 0.0001). See Table 1.
These women first authors' patterns were stable across time periods studied: early time period [...] leadership positions (i.e., Program Director, Division Chief, Department Chair, or Dean; p = 0.0001). Women first authors less commonly published clinical trials as compared to observational study designs (p < 0.001), and their projects were more frequently focused on infectious disease topics, in contrast to men, whose projects most often focused on cardiovascular topics (p < 0.001). Women first authors' papers trended towards more commonly having grant funding (p = 0.065) and more commonly reporting negative study findings (p = 0.063). See Tables 2-4 for gender-based comparisons of women first authors' publication-based characteristics; as collaborating authors could only be identified starting in 2008, this analysis (Table 3) reports a reduced number of records.

PLOS ONE
Academic medicine's glass ceiling: Authorship gender disparities in top three medical journals

For 34.89% of top medical research journals' publications, women were second authors. A gender disparity was found across journals overall, as well as within each journal (27.32% in NEJM, 34.12% in Lancet, and 43.81% in JAMA; all p < 0.01). See Table 5.
The overall rate of women last authors was 18.60%; this varied from 15.08% in NEJM, 19.83% in Lancet, and 20.96% in JAMA (all p < 0.001). Differences in women publication rates were most dramatic for last author roles. See Table 6.
For the detailed journal-specific variations in women first author-based publication and author characteristics by gender, please see S4-1 Table in S4 Appendix.
Moreover, the overall and journal-specific variations in publication characteristics were compared between publications that had at least one woman in any significant author role versus no women in any significant author role; please see S4-3 and S4-4 Tables in S4 Appendix.

Multiple publication rates
Based on bivariate comparisons, 2.88% of women first authors had multiple journal publications as compared to 13.35% of men first authors (p < 0.001). Interestingly, the multiple [...] The subsequent multiple publication rate for women as last authors was 3.68%; this rate trended lower than that for men last authors at 6.73% (p = 0.131). Women last authors' multiple publication rates varied across journals: they did not differ from men last authors' rates in JAMA (4.23% versus 3.37%; p = 0.721) or Lancet (2.99% versus 4.18%; p = 1.000), but trended lower for NEJM (0.00% versus 6.03%; p = 0.086). See Table 9.

Gender concordance
Exploring the concept of author team's gender alignment, the multiple publication rate was compared for first/last authors with the same gender [i.e., either (woman + woman) or (man + man)] versus those without. If either first or last author's gender was unknown, this matched pair was excluded from consideration in this analysis. In evaluating this study's metric of success (that is, a first author having multiple first author publications in top medical research journals), 17.29% (n = 107/619) of first authors with same-gender alignment had multiple publications, compared to 8.95% (n = 28/313) of those without gender alignment (p < 0.001). See Table 10. At the publication level, there were 1,050 publications with both the first and last authors' gender identified. The publication characteristics associated with same gender teams of first/last authors are presented in Tables 11 and 12. Note: Missing odds ratios were due to zero cell counts in combined categories, which lead to zero or infinite odds ratios.
Table 5 footnotes: the first set of p-values examined whether the proportion of women was 50% (i.e., whether the proportions of women and men were the same); the second set examined whether the gender disparities were similar across the three journals. https://doi.org/10.1371/journal.pone.0261209.t005
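The same-gender alignment comparison reported above (107/619 versus 28/313) can be checked with a short Python sketch of a Pearson chi-square test on the implied 2 x 2 table. This is a minimal sketch under the assumption that no continuity correction was applied, which the text does not specify; the variable and function names are hypothetical.

```python
import math

# Counts reported in the text: first authors with multiple publications.
aligned_multi, aligned_total = 107, 619      # same-gender first/last authors
unaligned_multi, unaligned_total = 28, 313   # different-gender first/last authors

table = [
    [aligned_multi, aligned_total - aligned_multi],
    [unaligned_multi, unaligned_total - unaligned_multi],
]


def chi_square_2x2(t):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table."""
    (a, b), (c, d) = t
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


stat = chi_square_2x2(table)
# With 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
p_value = math.erfc(math.sqrt(stat / 2))

aligned_rate = round(100 * aligned_multi / aligned_total, 2)        # 17.29
unaligned_rate = round(100 * unaligned_multi / unaligned_total, 2)  # 8.95
print(aligned_rate, unaligned_rate, p_value < 0.001)
```

Under these assumptions the recomputed rates match the reported percentages, and the p-value falls below the reported 0.001 bound.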

Author gender comparisons with United States full time medical school faculty
For each author role, the proportion of women authors within these top medical research journals was compared to the Association of American Medical Colleges (AAMC) annual reports documenting the proportion of women holding full-time faculty positions at United States (U.S.) medical schools from 2002 to 2019 [25]. For the first author, last author, and any significant author positions, substantial overall differences in women's representation were documented (p < 0.05), with year-by-year variations found.
Although not reaching statistical significance, a trend towards representation differences was found for women publishing in second author roles (p = 0.121). The detailed AAMC annual comparisons by author's gender by year, as well as an overall comparison, are provided in Fig 6. These data can also be found in tabular form with p-values in S3 Appendix. Additional analyses can be found in S4 Appendix.

Limitations
As with any observational database analysis, there were several limitations to this bibliometrics study. All non-MEDLINE data for author or publication-related characteristics, including gender, were collected by this study's team members using Internet searches. Unfortunately, no internal journal editorial office-based author databases (i.e., author-specific demographic data) were available to support this study. Moreover, authors were not contacted to verify their recorded information. As there could have been unconscious bias in the gender determinations (i.e., when self-reported gender was not available), an independent audit was performed. All inter-rater reliability assessments' kappa statistics were above 0.6 (acceptable concordance), except for the leadership and academic rank variable, which had a kappa = 0.5276; this may be, in part, due to changes in an author's Internet-based information (i.e., changes due to an academic promotion) that occurred between the time of original data extraction (April 2019) and final data verification and inter-rater reliability assessment (January 2021). Inter-rater reliability was extremely high for gender, the key study variable (kappa = 0.9204); please see S1 Appendix for audit findings.
The missing data rate was very low, with only 46 unique significant authors (~1.6% of significant authors) across 40 top medical research journals' publications (3.7% of all publications) for whom gender could not be assessed. The missing gender data appeared to be randomly distributed across publication year and the three significant author roles. As noted above, this study's sample was designed to detect gender disparities in publication rates for first authors; with the planned sample of 1,080 publications from the top three medical research journals, the ability to detect gender disparities in second, last, or any significant author roles was known in advance to be limited.
Overall, the sampled versus non-sampled records appeared similar (see generalizability findings provided in S2 Appendix). Although all other factors were well-balanced and without statistically significant differences, sampled records had higher clinical trial rates than non-sampled records (59.26% versus 54.56%, p = 0.0033), higher co-author counts (18.70% versus 16.41% in the 21+ co-author category, p = 0.0011), and (for the period from 2008 forward, when this information was available in MEDLINE) higher collaborating author counts (17.36% versus 12.66% in the highest [101+] category; p = 0.0004). Given these minor differences, this study's findings for gender disparities should be verified for other journals (e.g., medical specialty journals).
As the primary editorial offices for the three journals considered here were based in the United States and the United Kingdom, these findings may not adequately describe gender-based publication disparities in other parts of the world. These findings may be more representative of women authors working in institutions located within higher income nations (e.g., in North America) versus lower/middle income nations, and additional research is needed to assess for global patterns involving women authors.

Discussion
Women scientists have historically been underrepresented as authors in top medical research journals. Through increased awareness and calls to action, this is changing, but inequality persists. This study demonstrates that not only are women underrepresented as first, second, and last authors in high-impact journals, there is significant variation in the representation of women scientists amongst these journals.
Publication in high-impact medical research journals is often more than a personal achievement; it is a marker of professional success and future academic potential. These top medical research journal-related achievements are used to inform decisions for future academic promotions, grant funding, and appointment to leadership positions. Unfortunately, the differences in women versus men top medical research journals' publication outcomes identified here may likely serve to perpetuate the current gender inequalities found throughout medicine.
Further, these data suggest that women are far less likely than men to successfully publish in top medical research journals multiple times. With no trends observed over time, this disparity was observed in all three journals investigated.
In general, women had smaller co-author and collaborating author teams with lower WOS citation counts, which may indicate that lower impact projects were published. This is consistent with recent work in the field of economics that showed women to have fewer collaborators in their coauthor networks. Further, that same study found that controlling for coauthor network significantly reduced the publication gender gap [27].
In this study, women less frequently held M.D. or multiple doctoral-level degrees; this observation may reflect gender-based training and career decisions. Women also less frequently published clinical trials. Women more frequently published infectious disease research projects, whereas men more often focused on cardiovascular-related research topics. Women were more frequently based at North American institutions and focused their studies upon US-based clinical populations; this suggests that women may be under-represented in global research, as women outside of North America were less likely to be lead authors than women based in North America for original research articles published within these three premier journals.
final JEM gender-based corresponding author rates, as well as their calculations' details (e.g., gender-based sample sizes, specific analytical tests used, and p-values), were not provided [29].
Nonetheless, limited access to journal-based author information represents a major barrier to advancing our understanding of gender disparities in academic medicine and, more importantly, hinders the ability to resolve them. Even if editorial teams are not the source of bias, they hold some of the keys to progress. Top biomedical research journals are encouraged to follow JEM's lead and increase journal editorial office transparency. Top medical research journals' editorial offices should make their internal author databases (following appropriate de-identification of author records) publicly available for independent analysis or, at the very least, routinely provide published reports evaluating these same types of gender-bias issues, with independent audits to confirm these results.

Conclusions
The Lancet's 2017 recognition [22] that the time for change is now was an encouraging, positive step forward and very timely, given the publication gender disparities reported herein. These data also show, however, that dramatic gender disparities persist and that, despite this increased awareness, women first authors appear to continue to face great difficulty in breaking through academic medicine's glass ceiling.
More important questions persist, however. Namely, "why is this?" and "what are we to do about it?" Increased transparency among editorial offices will be one step toward answering these questions and increasing accountability; though this issue is certainly more complex, pipeline issues and the role of implicit bias at academic institutions remain areas for investigation. Based on the data reported herein, collaboration between senior women and more junior women researchers is one suggested strategy that may partially improve the future gender balance. Regardless of the cause, a steep uphill climb remains for women who aim to have a successful career in academic medicine.